fix(workflows): filter user-plugin MCP noise out of workflow warnings #1327
Conversation
Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author.

Fix:

- Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys).
- Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️ ` branch is unchanged.

Supersedes #1134 (closed as stale — the Windows HOME fix in that PR was already shipped via #1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers).

Credits @MrFadiAi for identifying and reporting the underlying issue.

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
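The split this commit describes can be sketched roughly as follows. This is an illustrative simplification, not the shipped code: the handler shape and callback signatures are assumed, and it forwards bare server names without the per-server status detail (a point the review below picks up on).

```typescript
// Illustrative sketch of the MCP-failure filtering split. MCP_FAILURE_PREFIX
// and the event name mirror the commit message; the handler shape is assumed.
const MCP_FAILURE_PREFIX = 'MCP server connection failed: ';

function handleSystemMessage(
  content: string,
  configuredMcpNames: Set<string>,
  forward: (msg: string) => void,
  debugLog: (event: string, names: string[]) => void
): void {
  if (!content.startsWith(MCP_FAILURE_PREFIX)) {
    // "⚠️ " provider warnings and other system messages: surface verbatim.
    forward(content);
    return;
  }
  // Parse "name (status), name (status), …" into bare server names.
  const names = content
    .slice(MCP_FAILURE_PREFIX.length)
    .split(', ')
    .map(entry => entry.split(' (')[0].trim())
    .filter(name => name.length > 0);
  const workflowFailures = names.filter(n => configuredMcpNames.has(n));
  const pluginFailures = names.filter(n => !configuredMcpNames.has(n));
  if (workflowFailures.length > 0) {
    // Only workflow-configured servers reach the conversation.
    forward(`${MCP_FAILURE_PREFIX}${workflowFailures.join(', ')}`);
  }
  if (pluginFailures.length > 0) {
    // User-plugin noise is demoted to a debug log entry.
    debugLog('dag.mcp_plugin_connection_suppressed', pluginFailures);
  }
}
```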
No actionable comments were generated in the recent review. 🎉
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

The PR adds parsing and config-loading helpers and updates the dag executor to load a node's configured MCP server names and filter MCP connection-failure warnings against them.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Runner as Runner/Client
    participant Exec as DagExecutor
    participant FS as FileSystem
    participant SDK as SDK Stream
    participant User as Workflow User
    participant Log as Debug Logger
    Runner->>Exec: executeNode(node)
    Exec->>FS: loadConfiguredMcpServerNames(node.mcp, cwd)
    FS-->>Exec: Set[string] of configured servers
    SDK->>Exec: system message "MCP server connection failed: ..."
    Exec->>Exec: parseMcpFailureServerNames(message)
    alt matches configured servers
        Exec->>User: forward filtered MCP failure message
    else all unconfigured (plugin noise)
        Exec->>Log: dag.mcp_plugin_connection_suppressed (debug)
    end
    SDK->>Exec: system message "⚠️ ..." (provider warning)
    Exec->>User: forward provider warning verbatim
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (2)
packages/workflows/src/dag-executor.ts (1)
885-912: Preserve MCP failure details while filtering names.
`filteredMsg` forwards only server names, and the debug log stores only names, so the original failure reason/status is lost for both workflow-configured servers and suppressed plugins. Keep the original entry text per parsed name while still filtering unconfigured servers.

♻️ Proposed diagnostics-preserving filter
```diff
 const failedNames = parseMcpFailureServerNames(msg.content);
+const failureEntryByName = new Map<string, string>();
+for (const entry of msg.content.slice(MCP_FAILURE_PREFIX.length).split(', ')) {
+  const name = entry.split(' (')[0]?.trim();
+  if (name && !failureEntryByName.has(name)) {
+    failureEntryByName.set(name, entry.trim());
+  }
+}
 const workflowFailures = failedNames.filter(n => configuredMcpNames.has(n));
 const pluginFailures = failedNames.filter(n => !configuredMcpNames.has(n));
 if (workflowFailures.length > 0) {
-  const filteredMsg = `${MCP_FAILURE_PREFIX}${workflowFailures.join(', ')}`;
+  const filteredMsg = `${MCP_FAILURE_PREFIX}${workflowFailures
+    .map(n => failureEntryByName.get(n) ?? n)
+    .join(', ')}`;
   getLog().warn(
     { nodeId: node.id, systemContent: filteredMsg },
     'dag.provider_warning_forwarded'
   );
@@
 if (pluginFailures.length > 0) {
   getLog().debug(
-    { nodeId: node.id, pluginFailures },
+    {
+      nodeId: node.id,
+      pluginFailures: pluginFailures.map(n => failureEntryByName.get(n) ?? n),
+    },
     'dag.mcp_plugin_connection_suppressed'
   );
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 885-912: the current code uses parseMcpFailureServerNames to extract only server names and loses the original failure text; update the filtering so you keep the original parsed entries (name+reason/status) while still splitting them into workflowFailures vs pluginFailures by checking configuredMcpNames. Specifically, change how failedNames is produced and filtered (references: parseMcpFailureServerNames, configuredMcpNames) so filteredMsg (MCP_FAILURE_PREFIX + ...) and the plugin debug log include the full original entry text per server rather than just the bare names, and continue to call safeSendMessage and error/log paths unchanged for workflowFailures and pluginFailures.

packages/workflows/src/dag-executor.test.ts (1)
5761-5853: Add one executor-path regression test for MCP warning filtering.

The helper tests are solid, but they do not prove `executeDagWorkflow` actually forwards configured MCP failures and suppresses plugin failures. A small mocked-provider test would cover the changed branch end-to-end without network or timing dependencies.

🧪 Example deterministic test coverage
```diff
+describe('executeDagWorkflow -- MCP plugin-noise filtering', () => {
+  let testDir: string;
+
+  beforeEach(async () => {
+    testDir = join(tmpdir(), `dag-mcp-filter-${Date.now()}-${Math.random().toString(36).slice(2)}`);
+    await mkdir(testDir, { recursive: true });
+    mockSendQueryDag.mockClear();
+    mockGetAgentProviderDag.mockImplementation(() => ({
+      sendQuery: mockSendQueryDag,
+      getType: () => 'claude',
+      getCapabilities: mockClaudeCapabilities,
+    }));
+  });
+
+  afterEach(async () => {
+    await rm(testDir, { recursive: true, force: true });
+  });
+
+  it('forwards only workflow-configured MCP failures', async () => {
+    await writeFile(join(testDir, 'mcp.json'), JSON.stringify({ github: { command: 'npx' } }));
+    mockSendQueryDag.mockImplementation(function* () {
+      yield {
+        type: 'system',
+        content: 'MCP server connection failed: telegram (disconnected), github (timeout)',
+      };
+      yield { type: 'result', sessionId: 'mcp-filter-session' };
+    });
+
+    const platform = createMockPlatform();
+
+    await executeDagWorkflow(
+      createMockDeps(),
+      platform,
+      'conv-mcp-filter',
+      testDir,
+      {
+        name: 'mcp-filter-test',
+        nodes: [{ id: 'review', prompt: 'Review', mcp: 'mcp.json' }],
+      },
+      makeWorkflowRun('mcp-filter-run'),
+      'claude',
+      undefined,
+      join(testDir, 'artifacts'),
+      join(testDir, 'logs'),
+      'main',
+      'docs/',
+      minimalConfig
+    );
+
+    const messages = (platform.sendMessage as ReturnType<typeof mock>).mock.calls.map(
+      (call: unknown[]) => call[1] as string
+    );
+    expect(messages.some(m => m.includes('github'))).toBe(true);
+    expect(messages.some(m => m.includes('telegram'))).toBe(false);
+  });
+});
+
 describe('parseMcpFailureServerNames', () => {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.test.ts` around lines 5761 - 5853, Add an end-to-end unit test for executeDagWorkflow that verifies MCP warning filtering by mocking a provider to emit both MCP server connection failures and plugin-level failures, and asserting executeDagWorkflow forwards only the configured MCP failures (using loadConfiguredMcpServerNames/parseMcpFailureServerNames semantics) while suppressing plugin failures; to implement, add a test that registers a fake provider which produces deterministic log/messages, call executeDagWorkflow with a nodeMcpPath pointing at a temp config created via the existing loadConfiguredMcpServerNames helper, then assert the workflow result/logs contain the MCP server names parsed by parseMcpFailureServerNames and do not contain plugin failure entries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0509c545-b54b-4669-81b4-6166ac95a096
📒 Files selected for processing (3)
- CHANGELOG.md
- packages/workflows/src/dag-executor.test.ts
- packages/workflows/src/dag-executor.ts
PR Review Summary (multi-agent)

Six specialist agents reviewed this PR. The core fix (split workflow-configured vs. plugin MCP failures) is sound and well-tested at the helper level. There is one likely-real bug, one observability gap that violates the project's fail-fast rule, and a handful of polish items.

Critical Issues (2)
Important Issues (4)
Suggestions (3)
Strengths
Verdict

NEEDS FIXES — the status-stripping bug (Critical 1) and the silent catch (Critical 2) should both be addressed before merge. The other items are quick polish or doc updates.

Recommended Actions
fix(workflows): preserve MCP failure status in filtered message + observability

Address review feedback on PR #1327:

- parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping.
- loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise.
- Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-`mcp:`-config nodes, and the unchanged ⚠️ warning path.
- Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge.
- Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs.
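A minimal sketch of the {name, segment} parser shape this commit describes (an assumed reconstruction, not the actual dag-executor code):

```typescript
// Sketch of a status-preserving parser: each failing server keeps both its
// bare name (for the configured-set lookup) and the full original segment
// (for the forwarded message and debug log). Identifiers are modeled on the
// commit message, not copied from the repository.
const MCP_FAILURE_PREFIX = 'MCP server connection failed: ';

interface McpFailureEntry {
  name: string;    // server name, e.g. "github"
  segment: string; // full original segment, e.g. "github (timeout)"
}

function parseMcpFailureServerNames(content: string): McpFailureEntry[] {
  if (!content.startsWith(MCP_FAILURE_PREFIX)) return [];
  // Dedupe by name, keeping the first segment seen, in input order.
  const byName = new Map<string, string>();
  for (const raw of content.slice(MCP_FAILURE_PREFIX.length).split(', ')) {
    const segment = raw.trim();
    const name = segment.split(' (')[0].trim();
    if (name && !byName.has(name)) byName.set(name, segment);
  }
  return [...byName].map(([name, segment]) => ({ name, segment }));
}
```

Because the filter can then join `segment` values instead of bare names, the "(timeout)" / "(disconnected)" detail survives into the forwarded warning.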
Pushed 7a32dde addressing the multi-agent review:
Skipped as YAGNI:
…h 7 (#12)

* fix(workflows): make archon-adversarial-dev sed replacement macOS-safe (coleam00#1155)

  * fix(workflows): make adversarial init sed portable on macOS
  * chore: regenerate bundled-defaults after adversarial-dev sed fix

  Sync generated bundle with the new temp-file sed pattern in archon-adversarial-dev.yaml so check:bundled passes and binary distributions ship the macOS-safe version.

  ---------

  Co-authored-by: laplace young <yangqk12@whu.edu.cn>
  Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>

* fix(workflows): filter user-plugin MCP noise out of workflow warnings (coleam00#1327)

  * fix(workflows): filter user-plugin MCP noise out of workflow warnings

  Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author.

  Fix:
  - Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys).
  - Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️ ` branch is unchanged.

  Supersedes coleam00#1134 (closed as stale — the Windows HOME fix in that PR was already shipped via coleam00#1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers).

  Credits @MrFadiAi for identifying and reporting the underlying issue.

  Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

  * fix(workflows): preserve MCP failure status in filtered message + observability

  Address review feedback on PR coleam00#1327:
  - parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping.
  - loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise.
  - Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-`mcp:`-config nodes, and the unchanged ⚠️ warning path.
  - Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge.
  - Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs.

  ---------

  Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

* test(workflows): add anyFailed status derivation coverage for DAG executor (coleam00#1403)

  PIV Task 1: Adds three new tests in a dedicated describe block 'executeDagWorkflow -- final status derivation' covering the anyFailed branch (dag-executor.ts ~line 2956) that previously had no direct test:
  - one success + one independent failure calls failWorkflowRun (not completeWorkflowRun)
  - multiple successes + one failure calls failWorkflowRun (not completeWorkflowRun)
  - trigger_rule: none_failed skips dependent node but anyFailed still marks run failed

  Fixes coleam00#1381.

* fix(workflow): migrate piv-loop plan handoff to $ARTIFACTS_DIR (coleam00#1398)

  * fix(workflow): migrate piv-loop plan handoff to $ARTIFACTS_DIR (coleam00#1380)

  The create-plan node used a relative path (.claude/archon/plans/{slug}.plan.md) that the AI agent would sometimes write to a different location, breaking all downstream nodes that glob for the plan file. Migrated all plan/progress file references to $ARTIFACTS_DIR/plan.md and $ARTIFACTS_DIR/progress.txt, matching the pattern used by archon-fix-github-issue and other workflows.

  Changes:
  - Replace slug-based plan path with $ARTIFACTS_DIR/plan.md in create-plan node
  - Replace ls -t glob discovery with direct $ARTIFACTS_DIR/plan.md reads in refine-plan, code-review, and fix-feedback nodes
  - Replace empty-string guard with file-existence check in implement-setup bash
  - Migrate progress.txt references in implement loop to $ARTIFACTS_DIR/
  - Add explicit plan/progress paths in finalize node
  - Regenerated bundled-defaults.generated.ts

  Fixes coleam00#1380

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  * fix(workflow): address review findings in archon-piv-loop

  - Rename 'Step 2: Write the Plan' to 'Step 2: Plan File Location' to eliminate the duplicate heading that collided with Step 3's identical title in the create-plan node
  - Guard implement-setup against a 0-task plan file: exit 1 with a clear error when no '### Task N:' sections are found, preventing a silent no-op implement loop
  - Remove 2>/dev/null from code-review commit so pre-commit hook failures and other stderr are visible to the agent instead of silently swallowed
  - Replace '|| true' on git push in finalize with an explicit WARNING echo so push failures (auth, upstream conflict, no remote) surface to the agent rather than being silently ignored
  - Regenerate bundled-defaults.generated.ts

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  * chore(workflows): regenerate bundled defaults to match opus[1m] alias

  The bundle was stale relative to the YAML sources after coleam00#1395 merged — check:bundled was failing CI. Regenerated; no YAML edits.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  ---------

  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(workflows): switch default Opus pin to opus[1m] alias (coleam00#1395)

  Anthropic's Opus 4.7 landed 2026-04-16; on the Anthropic API, opus / opus[1m] now resolve to 4.7 with a 1M context window at standard pricing. Using the alias instead of the hard-pinned claude-opus-4-6[1m] lets bundled default workflows auto-track the recommended Opus version. No explicit effort is set, so nodes inherit the per-model default (xhigh on 4.7, high on 4.6).

* fix(workflows): approval gate bypass after reject-with-redraft on resume (coleam00#1435)

  * fix(workflows): approval gate bypass after reject-with-redraft on resume

  When an approval node was rejected with on_reject.prompt, the synthetic PromptNode built to run the on_reject prompt reused the approval gate's own node ID. executeNodeInternal then wrote a node_completed event with that ID, causing getCompletedDagNodeOutputs to treat the gate as already completed on the next resume — bypassing the human gate entirely.

  Fix: give the synthetic node the ID `${node.id}:on_reject` so its node_completed event has a distinct step_name that won't match the approval gate slot in priorCompletedNodes. Adds a regression test asserting no node_completed event with the approval gate's ID is written during on_reject execution.

  Fixes coleam00#1429

  * test(workflows): add positive assertion and SSE side-effect comment for on_reject synthetic node

  Add complementary positive assertion to the regression test to verify that node_completed is written exactly once with step_name 'review:on_reject', ensuring future refactors that suppress the event entirely would be caught.

  Add inline comment in executeApprovalNode documenting the known SSE side-effect: node_started/node_completed events with nodeId='review:on_reject' flow through the SSE pipeline into the web UI, resulting in a transient phantom node in the execution view. This is cosmetic-only — the human gate contract is preserved.

  * simplify: reduce duplicate cast pattern in on_reject test assertions

* fix(workflows): concise failure messages for bash/script nodes (coleam00#1389) (coleam00#1393)

  * fix(workflows): concise failure messages for bash/script nodes (coleam00#1389)

  When a `bash:` or `script:` node fails, the executor was embedding `err.message` verbatim into the user-visible error. For inline scripts run via `bash -c <body>` / `bun -e <body>`, Node's `promisify(execFile)` puts the entire substituted script body into `err.message`, `err.cmd`, and the first line of `err.stack` — and the Pino log serialized all three, repeating the body ~3× in one structured log line. The actionable diagnostic was buried under kilobytes of echoed source.

  Route both catch-block default branches through a new `formatSubprocessFailure(err, label)` helper in `executor-shared.ts` that:
  - strips the `Command failed: <cmd>` prefix line (which carries the body)
  - prefers `err.stderr` — Bun/bash emit the actionable diagnostic there
  - tail-truncates at 2 KB with a `…[truncated]` marker
  - returns a controlled `logFields` subset (`exitCode`, `killed`, `stderrTail`) so Pino never re-serializes `err.message` / `err.stack` / `err.cmd`

  Also drops the script-node `stderrHint` concatenation — stderr is already handled by the helper, so the previous code appended it twice. Timeout / ENOENT / EACCES branches are preserved verbatim.

  Fixes coleam00#1389

  * fix(workflows): address PR review feedback for coleam00#1389

  - Run formatSubprocessFailure unconditionally so timeout / ENOENT / EACCES branches also get sanitized log fields (the timeout message also embeds the `Command failed: bash -c <body>` line).
  - Drop `errType: err.constructor.name` (always 'Error' in production) and replace with `nodeType: 'bash' | 'script'` for actual discriminating value.
  - Replace chained `||` + ternary in diagnostic selection with explicit if/else for readability.
  - Simplify exit suffix guard: `err.code != null` instead of double typeof.
  - Make stderrTail emptiness check explicit.
  - Drop `RawSubprocessError` export (no external consumer) and widen `code` to `number | string | null` to mirror Node's ExecFileException.
  - Tighten test length bound to <2100 (was <4000 / <2200) so a doubling of SUBPROCESS_ERROR_MAX_CHARS would actually trip the assertion.
  - Replace broad regex assertion with `.toContain('[eval]')` — the location marker is the strongest signal that diagnostic content survived.
  - Strip issue-number citations from describe block and test comments.
  - Update script-nodes.md to distinguish stderr behavior on success vs failure paths.
  - Add CHANGELOG Unreleased / Fixed entry for the user-visible change.

* chore(workflows): regenerate bundled defaults after Tier 6 picks

* docs: add CHANGELOG entry for Tier 6 workflow polish batch

---------

Co-authored-by: CauchYoung <2024302072042@whu.edu.cn>
Co-authored-by: laplace young <yangqk12@whu.edu.cn>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
Co-authored-by: Cole Medin <cole@dynamous.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: cjnprospa <sirhcle.j23@gmail.com>
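The `formatSubprocessFailure` helper described in the bash/script-node commits above might look roughly like this. The constant value, error shape, and message format are assumptions for illustration, not the shipped executor-shared.ts code:

```typescript
// Hedged sketch: prefer stderr over err.message, strip the "Command failed:"
// line that echoes the whole script body, and tail-truncate the diagnostic.
const SUBPROCESS_ERROR_MAX_CHARS = 2048; // assumed value for the "2 KB" cap

interface SubprocessError {
  message: string;
  stderr?: string;
  code?: number | string | null; // mirrors Node's ExecFileException.code
  killed?: boolean;
}

function formatSubprocessFailure(err: SubprocessError, label: string): string {
  let diagnostic: string;
  if (err.stderr !== undefined && err.stderr.trim() !== '') {
    // bash/bun put the actionable diagnostic on stderr.
    diagnostic = err.stderr.trim();
  } else {
    // Drop the first "Command failed: <cmd>" line, which carries the body.
    diagnostic = err.message.split('\n').slice(1).join('\n').trim() || 'unknown error';
  }
  if (diagnostic.length > SUBPROCESS_ERROR_MAX_CHARS) {
    // Keep the tail: errors usually end with the most specific information.
    diagnostic = '…[truncated]' + diagnostic.slice(-SUBPROCESS_ERROR_MAX_CHARS);
  }
  const exitSuffix = err.code != null ? ` (exit code ${err.code})` : '';
  return `${label} failed${exitSuffix}: ${diagnostic}`;
}
```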
Summary
Supersedes #1134 (closed — stale and conflicting).
The Claude SDK surfaces a system message of the form `MCP server connection failed: <name> (<status>), …` whenever any MCP server in the subprocess fails to connect. Until now the dag-executor forwarded the entire message to the conversation. That meant user-level plugin MCPs inherited from `~/.claude/` (e.g. `telegram`, `notion`) — which have no connection to what the workflow author configured and routinely fail inside a headless subprocess — were shown as workflow warnings.

Fix: only surface failures for servers the workflow actually configured via `mcp:`.

Changes
packages/workflows/src/dag-executor.ts

- `parseMcpFailureServerNames(message)` — parses the SDK's formatted failure string into an ordered, deduped name list.
- `loadConfiguredMcpServerNames(nodeMcpPath, cwd)` — reads the node's `mcp:` JSON file and returns `Object.keys()`. Deliberately does not env-expand (the provider owns that and will surface its own parse errors via ⚠️); parse/read failures return an empty set.
- `executeNodeInternal` pre-computes `configuredMcpNames` once at entry.
- The `msg.type === 'system'` handler now has two distinct branches:
  - `MCP server connection failed:` → split by workflow-configured vs. plugin; surface only the workflow-configured subset; debug-log the rest as `dag.mcp_plugin_connection_suppressed`.
  - `⚠️` → unchanged (always surface — these are author-actionable, e.g. missing env vars, Haiku+MCP, invalid structured output).

packages/workflows/src/dag-executor.test.ts — 13 new tests covering both helpers (malformed input, dedup, absolute/relative paths, missing files, invalid JSON, non-object JSON).

CHANGELOG.md — entry under Fixed, credits @MrFadiAi.

Why this approach over #1134
The original PR also tried to pass `settings: { enabledPlugins: {} }` to the Claude SDK to disable user plugins in the subprocess. We deliberately skip that: it would change subprocess behavior globally (not just silence the warning), and the SDK's plugin-disable flag is an ergonomics concession that could mask real issues. Filtering at the display layer keeps all diagnostics available in debug logs while giving workflow authors a clean conversation. If a workflow-configured MCP really fails, the warning still surfaces verbatim.

Parsing the `mcp:` JSON twice (once here for names, once in the provider for full env-expanded server configs) was judged cheaper than changing the provider contract to return the names it loaded. The dag-executor read is cached per node-invocation and tolerant of any file-read failure.

Test plan
- `bun run validate` green locally (type-check + lint + format:check + full test suite)
- `bun test packages/workflows/src/dag-executor.test.ts` — 177 pass / 0 fail
- New unit tests for `parseMcpFailureServerNames` + `loadConfiguredMcpServerNames`

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
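For reference, the `loadConfiguredMcpServerNames` behavior described under Changes (keys only, no env expansion, empty set on any read/parse failure) can be sketched as follows; path resolution and logging here are simplified assumptions, not the shipped implementation:

```typescript
import { readFileSync } from 'node:fs';
import { isAbsolute, join } from 'node:path';

// Read the node's mcp: JSON config and return its top-level server names.
// Tolerant by design: a missing file, invalid JSON, or non-object root all
// yield an empty set rather than failing the node.
function loadConfiguredMcpServerNames(nodeMcpPath: string, cwd: string): Set<string> {
  try {
    const resolved = isAbsolute(nodeMcpPath) ? nodeMcpPath : join(cwd, nodeMcpPath);
    const parsed: unknown = JSON.parse(readFileSync(resolved, 'utf8'));
    // Deliberately no env-var expansion: only the keys are needed here.
    if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return new Set(Object.keys(parsed));
    }
  } catch {
    // Missing file or invalid JSON: fall through to the empty set.
  }
  return new Set();
}
```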
Summary by CodeRabbit
Bug Fixes
Tests
Documentation